Three independent techniques are used to separate fine structure from the absorption spectra, the background function in which is approximated by

(i) a smoothing spline. We propose a new reliable criterion for determination of the smoothing parameter and a method for raising the stability with respect to $k_{\min}$ variation;

(ii) an interpolation spline with varied knots;

(iii) the line obtained from bayesian smoothing. This method considers various prior information and includes a natural way to determine the errors of XAFS extraction.

Particular attention has been given to the estimation of uncertainties in XAFS data. Experimental noise is shown to be substantially smaller than the errors of the background approximation, and it is the latter that determines the variances of structural parameters in subsequent fitting.

I. INTRODUCTION

X-ray-absorption fine structure (XAFS), $\chi$, is determined by [1]:

$$\chi(E) = \frac{\mu(E) - \mu_0(E)}{\mu_0(E) - \mu_b(E)}, \qquad (1)$$

where $\mu$ is the measured absorption, $\mu_0$ is the "atomic" absorption due to electrons of the considered atomic level, and $\mu_b$ is the absorption of other processes. Since the electronic state of an embedded atom is, in general, different from its state in the gaseous phase, $\mu_0$ is not the same as for an isolated atom and cannot be found experimentally. Therefore a demand arises for an artificial construction of $\mu_0$.

Usually, $\mu_b$ is approximated by a Victoreen polynomial $P = aE^{-3} + bE^{-4}$ [1] or by a more general polynomial $P$, the coefficients of which are found by the least squares method from $\mu(E) = P(E)$ at energies lower than the edge. Further, the energy dependence is transformed to the photoelectron wave number dependence: $k = \sqrt{2m_e(E - E_0)/\hbar^2}$, where $E_0$ is the energy of the corresponding absorption edge. Usually, $E_0$ is assigned the energy at half the step or the energy of the inflection point of $\mu(E)$. In most practical works the deviation of $E_0$ from the true value, $\Delta E_0$, is one of the fitting parameters.
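The two formulas above can be sketched in a few lines. This is a minimal illustration, not the VIPER code: the function names are hypothetical, and the constant $\hbar^2/2m_e \approx 3.81$ eV Å$^2$ is the standard value used to express $k$ in Å$^{-1}$ when energies are in eV.

```python
import math

# hbar^2 / (2 m_e) in eV * Angstrom^2 (standard physical constant, about 3.81)
HBAR2_OVER_2M = 3.80998


def wavenumber(energy_ev, e0_ev):
    """Photoelectron wave number k (1/Angstrom) for an energy at or above the edge E0."""
    if energy_ev < e0_ev:
        raise ValueError("k is defined only for E >= E0")
    return math.sqrt((energy_ev - e0_ev) / HBAR2_OVER_2M)


def chi(mu, mu0, mub):
    """Eq. (1): fine structure from total, atomic-like and pre-edge absorption."""
    return (mu - mu0) / (mu0 - mub)
```

For instance, a point 100 eV above the edge corresponds to $k \approx 5.12$ Å$^{-1}$.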

The most difficult procedure in extracting XAFS from the measured absorption is the construction of $\mu_0$, since one cannot definitely distinguish the environment-born part of the absorption from the atomic-like one. All methods for determination of the post-edge background are based on the assumption of its smoothness, and the only criterion for its validity is the absence of low-frequency structure in $\chi(k)$, i.e. the small absolute value of the Fourier transform (FT) $\rho(r)$ at low $r$. A review of existing post-edge background methods and the proposal of some new ones is the main purpose of the article.

Special attention must be paid to the estimation of noise and uncertainties in XAFS data. Experimental noise is shown to be substantially smaller than the errors of the background approximation, and it is the latter that determines the variances of structural parameters in subsequent fitting. The corresponding section of the present article is closely related to the next article, devoted to determining the errors of structural parameters [2].

All methods described in the article for background removal, its error estimation, and XAFS-function correction are implemented in the freeware program VIPER [3], which allows one to vary several parameters by hand and watch the results simultaneously.

II. METHODS OF $\mu_0$ CONSTRUCTION
A. Smoothing spline 

Owing to a fast algorithm and easy programming, the approximation of $\mu_0$ by a smoothing spline has become widespread. Let $N+2$ experimental values $\mu_i$ be defined on the mesh $E_i$. The smoothing spline $\mu_0$ minimizes the functional

$$J(\mu_0, \mu) = \int [\mu_0''(E)]^2\, dE + \frac{1}{\alpha} \sum_i [\mu_{0i} - \mu_i]^2. \qquad (2)$$

The smoothing parameter (or regularizer) $\alpha$ is the measure of compromise between the smoothness of $\mu_0$ and its deviation from $\mu$. At $\alpha = 0$ the smoothing spline exactly coincides with $\mu$; at $\alpha \to \infty$ it degenerates to a straight line ($\mu_0' = \mathrm{const}$). The optimal regularizer should lead to $\mu_0$ containing only low-frequency oscillations and, hence, to $\chi$ containing only structural oscillations. The formulation of a new criterion for the optimal $\alpha$ is considered below.

First, we address another problem, the well-known spline instability with respect to small variations of the input parameters: the number of nodes, the nodal values of the processed function, and the limits of the integral. In our case the spline is most sensitive to $E_{\min}$ because of the fast growth of $\mu$ at the edge. To raise the stability, a method was put forward in the VIPER program which consists in using prior information specifying the shape of the $\mu_0(E)$ dependence. It is known in advance that an absorption edge without the so-called white line constitutes a nearly smooth step; the white line, if present, is added to the step. Denote this prior function as $p(E)$. Now we make the second derivative of the sought $\mu_0(E)$ tend not to zero (at the specified deviation of $\mu_0$ from $\mu$) but to the second derivative of $p(E)$. The sought $\mu_0(E)$ now minimizes the functional

$$J^*(\mu_0, \mu) = \int [\mu_0''(E) - p''(E)]^2\, dE + \frac{1}{\alpha} \sum_i [\mu_{0i} - \mu_i]^2. \qquad (3)$$

As seen, in fact there is no need to know $p(E)$ itself; its second derivative is sufficient. The explicit presence of $p(E)$ in the following formulas should be taken as a consequence of the technical trick applied: first $p(E)$ is subtracted from the data, then it is added to the found spline.
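The subtract-smooth-add trick can be sketched as follows. This is only an illustration under simplifying assumptions that are not taken from the paper: a uniform mesh, plain second differences in place of a true spline, and a small dense solver. The function names are hypothetical.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def smooth(y, alpha):
    """Minimize sum(second difference)^2 + (1/alpha) * sum((m - y)^2)."""
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(1, n - 1):  # one second-difference term per interior point
        idx, coef = (i - 1, i, i + 1), (1.0, -2.0, 1.0)
        for p, cp in zip(idx, coef):
            for q, cq in zip(idx, coef):
                a[p][q] += alpha * cp * cq
    # Normal equations of the functional: (I + alpha * D^T D) m = y
    return solve(a, list(y))


def smooth_with_prior(mu, prior, alpha):
    """Subtract the prior, smooth the remainder, add the prior back."""
    resid = [m - p for m, p in zip(mu, prior)]
    return [s + p for s, p in zip(smooth(resid, alpha), prior)]
```

With this construction a spectrum that coincides with its prior is returned unchanged for any `alpha`, which is the stabilizing effect the prior is meant to provide near the edge.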

FIG. 1. Extraction of XAFS from the measured absorption using the smoothing spline. Prior function $p(E)$ for the atomic-like absorption is drawn by dots. Solid lines: $\mu_0(E)$, $\chi(k)\cdot k^w$, and $\rho(r)$ obtained with use of the prior function; dashed lines: the same without the prior function. The regularizer $\alpha$ is the same for both cases.

FIG. 2. The $\rho(r)$ peak heights squared, $H^2$, maximal in the indicated areas and average over the range $0 < r < 0.5$ Å, as functions of $\alpha$. The right axis relates to the second derivative of the first peak height squared with respect to $\ln\alpha$.

Represent the second derivatives in the finite-difference approximation, introduce $\tilde\mu_{0i} = \mu_{0i} - p_i$, and denote $\Delta_i = E_{i+1} - E_i$:

$$J^*(\mu_0, \mu) = \sum_{i=1}^{N} \left[\tilde\mu_{0,i-1}\Delta_{i-1}^{-1} - \tilde\mu_{0i}\left(\Delta_{i-1}^{-1} + \Delta_i^{-1}\right) + \tilde\mu_{0,i+1}\Delta_i^{-1}\right]^2 + \frac{1}{\alpha}\sum_{i=0}^{N+1} \left[\tilde\mu_{0i} - (\mu_i - p_i)\right]^2 = J(\tilde\mu_0, \mu_i - p_i). \qquad (4)$$

Thus, the problem is reduced to the preceding one, in which the difference $\mu_i - p_i$ appears instead of the initial data $\mu_i$. The sought $\mu_0$ is found from the smooth $\tilde\mu_0$ as $\mu_{0i} = \tilde\mu_{0i} + p_i$. Fig. 1 shows an example of the approximation of the atomic-like absorption by the smoothing spline with and without the use of the prior function^1. The energy $E_0$ was determined at half the step height. Here, we constructed $p(E)$ in the following manner. We found the average value $\bar\mu$ of $\mu(E)$ in the region $20 < E < 70$ eV above the absorptance maximum. Moving from the beginning of the spectrum, we assigned $p = \mu$ until $\mu > \bar\mu$, and $p = \bar\mu$ further on. Then $p(E)$ was smoothed 5 times over 3 points. To perform the Fourier transform, $\chi(k)\cdot k^w$ was brought onto a uniform scale with $\delta k = 0.03$ Å$^{-1}$ and multiplied by a Kaiser-Bessel window with parameter $A = 1.5$. As seen, the use of $p(E)$ has led to the disappearance of the spurious peak in the absolute value of the FT at $r \approx 0.5$ Å.
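The recipe for the prior function can be sketched as below. The details that the text leaves open are assumptions of this sketch: "smoothed over 3 points" is taken to be a 3-point moving average with the end points kept fixed, and the function name and default arguments are hypothetical.

```python
def build_prior(energy, mu, lo=20.0, hi=70.0, passes=5):
    """Step-like prior p(E): follows mu up to the post-edge mean level, then stays flat."""
    # Energy of the absorption maximum
    i_max = max(range(len(mu)), key=lambda i: mu[i])
    e_max = energy[i_max]
    # Mean absorption between lo and hi eV above that maximum
    window = [m for e, m in zip(energy, mu) if lo < e - e_max < hi]
    mean = sum(window) / len(window)
    # p = mu until mu first exceeds the mean, p = mean afterwards
    prior, above = [], False
    for m in mu:
        above = above or m > mean
        prior.append(mean if above else m)
    # Repeated 3-point moving average, end points kept fixed (assumption)
    for _ in range(passes):
        inner = [(prior[i - 1] + prior[i] + prior[i + 1]) / 3.0
                 for i in range(1, len(prior) - 1)]
        prior = [prior[0]] + inner + [prior[-1]]
    return prior
```

A white line is cut off by construction, because the prior never rises above the post-edge mean; a prior that keeps the white line would need an extra term added to the step.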

So far we have considered the atomic-like absorption to be a smooth function with no peculiarities. However, in some spectra $\mu_0$ itself has a fine structure [4,5] originating from resonance scattering within the absorbing atom or from multi-electron transitions. If in these cases, based on theoretical calculations, experimental information, or empirical considerations, one can approximately indicate the location of the peculiarities, their width and their weight relative to the step height, then one can readily construct the prior function $p(E)$ and find the correct $\mu_0$. Instead of a constant value above the absorption edge, the prior function would have corresponding valleys and/or peaks.

Let us now define the criterion for determination of the smoothing parameter. An attempt to solve the problem was made in Ref. [6], where the requirement $H_p - H_n > 0.05 H_m$ was proposed; here $H_p$ is the average value of the weighted Fourier transform magnitude between 0 and 0.25 Å, $H_m$ is the maximum value of the transform magnitude between 1 and 5 Å, and $H_n$ is the average value of the transform magnitude between 9 and 10 Å, attributed to the noise. Obviously, this criterion cannot claim generality, since it depends on the weighting used op. cit. and on the relative contributions of the noise and the first coordination shell to the spectra.

In the program VIPER we have proposed another approach to the problem, based on consideration of the heights of the FT peaks as functions of the regularizer $\alpha$ (see Fig. 2). As $\alpha$ increases from zero, $\mu_0$ starts to deviate from the experimental absorption $\mu$; $\rho(r)$ grows and then saturates, the peaks at larger $r$ saturating earlier. Clearly, $\alpha$ should be determined by the first peak height, since it is the last to saturate. We define the start of saturation as the minimum of the second derivative of the first peak height squared with respect to $\ln\alpha$, and declare the value of $\alpha$ at this minimum to be optimal. It is seen that an increase of $\alpha$ beyond the optimal value leads to unwanted rapid growth of $\rho$ at low $r$.

In the example in Fig. 1 the regularizer is optimized following our new criterion. 
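The criterion can be sketched as follows, assuming the first-peak height squared has already been tabulated on a grid of regularizer values (the extraction and Fourier transform for each $\alpha$ are not shown). The function name is hypothetical, and the second derivative is taken by finite differences in $\ln\alpha$.

```python
import math


def optimal_alpha(alphas, h2):
    """Alpha at the minimum of d^2(H^2)/d(ln alpha)^2, by finite differences."""
    x = [math.log(a) for a in alphas]
    best_i, best_d2 = None, None
    for i in range(1, len(x) - 1):
        hl, hr = x[i] - x[i - 1], x[i + 1] - x[i]
        # Three-point second derivative, valid for a non-uniform grid in ln(alpha)
        d2 = 2.0 * ((h2[i + 1] - h2[i]) / hr - (h2[i] - h2[i - 1]) / hl) / (hl + hr)
        if best_d2 is None or d2 < best_d2:
            best_i, best_d2 = i, d2
    return alphas[best_i]
```

In practice the tabulated heights are noisy, so some smoothing of `h2` before differentiation may be needed; that step is omitted here.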



Unfortunately, the smoothing spline method does not include any approach to estimating the uncertainties in the $\mu_0$ obtained, in contrast to the following two methods.

^1 Here and hereafter the examples use the spectrum at the Bi $L_3$ absorption edge in Ba$_{0.6}$K$_{0.4}$BiO$_3$ at 50 K, recorded in transmission mode at the D-21 line (XAS-13) of DCI (LURE, Orsay, France) at a positron beam energy of 1.85 GeV and an average current of about 250 mA. Energy step: 1 eV; counting time: 1 s. The energy resolution of the double-crystal Si [311] monochromator (detuned to reject 50% of the incident signal in order to minimise harmonic contamination) with a 0.4 mm slit was about 2-3 eV at 13 keV.



B. Interpolation spline drawn through the varied knots 

The method was put forward in Ref. [7]. $N$ knots are equally spaced in $k$ space, and an interpolation spline is drawn through them. The ordinates of the knots are varied to minimize $\rho$ or $|\rho - \rho_{\rm st}|$ in the chosen low-$r$ region $0 < r < r_0$, where $\rho_{\rm st}$ is the absolute value of the FT of a "standard" $\chi_{\rm st}(k)\cdot k^w$, calculated or experimental. The number of knots must not exceed the value $N_{\max} = 2r_0\Delta k/\pi + 1$ [8], where $\Delta k$ is the $k$ range of useful data. In Ref. [7] it was asserted that one needs to know the "standard" $\chi_{\rm st}(k)\cdot k^w$ only approximately, since it is used only to get an estimate of the leakage from the first shell into the region minimized. Strangely, having omitted the question of the accuracy of the found knots (rather poor, as we show below), the authors of the cited work made a fine comparison between several theoretical models for $\chi(k)$ calculations.
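The bound on the number of knots is a one-line calculation. In the sketch below the function name is hypothetical, and the $k$ range in the usage note is an assumed value chosen only to illustrate the arithmetic.

```python
import math


def max_knots(r0, k_min, k_max):
    """N_max = 2 * r0 * (k_max - k_min) / pi + 1, the limit on the number of knots."""
    return 2.0 * r0 * (k_max - k_min) / math.pi + 1.0
```

For example, with $r_0 = 1.05$ Å and an assumed useful range of 18.25 Å$^{-1}$, `max_knots(1.05, 0.0, 18.25)` gives about 13.2, so at most 13 knots would be allowed.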

Fig. 3 shows an example of the application of the method. The ordinates of the 13 knots ($N_{\max} = 13.2$) were varied to minimize the difference $\rho - \rho_{\rm st}$ at $0 < r < 1.05$ Å. The function $\chi_{\rm st}(k)$ was calculated using the FEFF6 program [9] (as pointed out above, a crude estimate is sufficient, so details are omitted). In the minimized region $\rho(r)$ is somewhat better than that obtained by the previous method. However, at $k > 15$ Å$^{-1}$ one can distinguish the obviously wrong behavior of $\chi(k)\cdot k^w$, and the first peak of $\rho(r)$ becomes quite distorted.

Consider now the problem of the accuracy of the knot positions $y_j$, $j = 1, \ldots, N$, in fitting $\rho(r)$ to $\rho_{\rm st}(r)$. As a figure of merit, the $\chi^2$-statistics appears:

$$\chi^2 = \sum_m \frac{[\rho(r_m) - \rho_{\rm st}(r_m)]^2}{\sigma_m^2}, \qquad (5)$$

where $\sigma_m$ are the errors of $\rho(r_m)$. It can be shown (for a detailed analysis see the next article [2]) that, under the assumption of uncorrelated knot positions, the mean-square deviation of $y_j$ from the optimal value $\hat y_j$ obtained through the fit equals $\delta(y_j) = \left(\tfrac{1}{2}\,\partial^2\chi^2/\partial y_j^2\right)^{-1/2}$, where the partial derivatives are calculated in the fitting procedure at the minimum and $\sigma_m$ are assumed to be constant and equal to the root-mean-square average of $\rho(r)$ between 15 and 25 Å, where solely the noise is present. The errors $\varepsilon_j = \delta(y_j)/[\mu_0(E_j) - \mu_b(E_j)]$ found under such assumptions are shown in Fig. 4 as open circles with the solid line. Notice that the assumption that the knot positions are not correlated gives quite optimistic $\varepsilon_j$. Actually, the first several knot positions appear to be highly correlated; properly taking the correlations into account (we do not present these calculations here) raises $\varepsilon_j$ at least twofold. But even these underestimated $\varepsilon_j$ are appreciably larger than those given by the following method.

FIG. 3. Extraction of XAFS from the measured absorption using the interpolation spline through the knots with varied ordinates. In (c) the Fourier transforms are shown for the sought $\chi(k)\cdot k^w$ (solid), the "standard" (dots), and that obtained by the previous method (dashed).

FIG. 4. Errors of $\chi(k)$ extraction. Solid line with open circles: by the method of the interpolation spline drawn through the varied knots. Dots: by the method of bayesian smoothing without (a) and with (b, c) prior information specifying the second derivative. Besides, (c) uses the additional information that $\mu_0(E)$ passes through a point immediately before $E_0$. Solid line with filling: the envelope of $\chi(k)$ (not weighted). Dashed lines: the noise estimates from the FT ($n_k$) and from Poisson counting statistics ($n_P$) (see Sec. III).
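The uncorrelated-knot error estimate, $\delta(y_j) = (\tfrac12\,\partial^2\chi^2/\partial y_j^2)^{-1/2}$, can be sketched with a central finite difference. The function name and the step size are assumptions of this sketch; `chi2` stands for any callable returning the figure of merit for a list of knot ordinates.

```python
def knot_error(chi2, y, j, step=1e-3):
    """Error of knot j from the curvature of chi2 at the minimum y (knots uncorrelated)."""
    up, down = list(y), list(y)
    up[j] += step
    down[j] -= step
    # Central second difference approximates d^2 chi2 / d y_j^2
    curvature = (chi2(up) - 2.0 * chi2(y) + chi2(down)) / step ** 2
    return (0.5 * curvature) ** -0.5
```

For a purely quadratic figure of merit $\sum_j (y_j - c_j)^2/s_j^2$ this returns $s_j$, which is a quick sanity check. Correlations between knots are ignored here, so the result is optimistic in the same sense as discussed above.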



C. Bayesian smooth curve 

Ideologically similar to the smoothing spline method is the method of bayesian smoothing (see Appendix) proposed in the program VIPER. This method also finds the regularized function $\mu_0$; the regularizer $\alpha$ is the measure of compromise between the smoothness of $\mu_0$ and its deviation from $\mu$. In comparison with the smoothing spline method, this method has some advantages. (i) Various prior information on $\mu_0$ can be considered. (ii) In this method the posterior distributions of all $\mu_{0j}$ are sought. From those distributions one can find not only the average values but also any desired moments, which is an additional difficulty for other methods. (iii) In the framework of the method it is also possible to deconvolve $\mu$ with the monochromator rocking curve. The weakness of the method is its low speed (compared with method II A, not with II B!). On a modern PC a curve drawn through $N \approx 500$ points is smoothed in a few minutes.

In Fig. 5 the bayesian smoothing was done on the mesh of 536 experimental points above $E_0$, without and with the prior function (its construction is described in Sec. II A). Besides, in the latter case another piece of information was used: the atomic-like absorption must coincide with the total absorption (minus the pre-edge background) at energies $E < E_0$. Therefore, we required the bayesian curve to pass through the nearest point to the left of $E_0$. The values $\bar\mu_{0j}$ and $\delta^2(\mu_{0j})$ were found by formulas (A31) and (A33). Since the smoothed values do not lie within the limits of $\pm\delta(\mu_{0j})$ from $\mu_j$, we did not look for the most probable smoothness (see Appendix); instead we considered the regularizer to be known and equal to the optimal one found in method II A. The introduction of the prior information has significantly diminished the errors of the $\chi(k)$ extraction (see dotted curves in Fig. 4), which were defined as $\varepsilon_j = \delta(\mu_{0j})/(\mu_{0j} - \mu_{bj})$. This is quite natural: any decrease of our ignorance about $\mu_0$ should narrow the posterior distribution of $\mu_{0j}$ for all $j$. Of course, this concerns the experimental information as well: the more measured points $N$ the spectrum has, the smaller the errors $\varepsilon_j$. Comparing Fig. 1(c) and Fig. 5(c), one sees practically perfect coincidence of the results of bayesian smoothing and the smoothing spline. From this one can assume the equality of the errors which both methods give.

Could we take into account possible systematic errors in the framework of the method? Yes, if we have information on their nature and are able to translate it into mathematical language; such a translation might be rather non-trivial. In any case, we now have a tool to extract from the prior and experimental information not only the sought values but their errors as well.

FIG. 5. Extraction of XAFS from the measured absorption using bayesian smoothing. Prior function $p(E)$ for the atomic-like absorption is drawn by dots. Solid lines: $\mu_0(E)$, $\chi(k)\cdot k^w$, and $\rho(r)$ obtained with the use of the prior function; dashed lines: the same without the prior function. The dotted line in (c) is obtained without the additional requirement for $\mu_0(E)$ to pass through a point immediately before $E_0$. The regularizer $\alpha$ is the same for all cases and equal to the optimal one found for the smoothing spline.



D. Other methods 



Consider briefly the methods for $\mu_0$ construction not included in the VIPER program.

A rich variety of computer programs for XAS spectra processing is collected on the International XAFS Society Web site [10]. The vast majority of them approximate the atomic-like absorption by a smoothing spline or a more general piecewise-polynomial representation. For example, in the method of Ref. [11], the construction of $\mu_0$ is divided into several stages: $\mu_0$ is approximated by a low-degree polynomial, the obtained $\chi(k)$ is multiplied by $k^w$, an additional $\mu_0$ is drawn again as a low-degree polynomial and subtracted, and a smoothing spline then approximates one more additional $\mu_0$. The sum of all $\mu_0$'s gives the total atomic-like background. The necessity of the preliminary stages was not discussed op. cit.; clearly, however, it was caused by the instability of the spline with respect to small variations of the input parameters. And the point is not that the preliminary stages make the process stable, but that for each specific spectrum the auxiliary parameters (degrees of the polynomials) could provide an acceptable construction of the atomic-like background. Above (in Sec. II A) we proposed a way to raise the stability of the spline that makes the preliminary stages redundant.

In Ref. [12] an iterative approach to "atomic background" removal was developed. First a spline is used to obtain a rough estimate of the background; this alone is enough to have a reliable $\chi$ at $k > 5$-$6$ Å$^{-1}$. Over that range the $\chi$ obtained is fitted to the theoretical $\chi_{\rm th}$ in $r$-space. The resulting fit parameters are then used to generate $\chi_{\rm th}(k)$ that extends down to low $k$. This function is transformed back into $E$-space and $\mu_0$ is obtained as $\mu_0 = \mu/(\chi_{\rm th} + 1)$, which needs to be slightly smoothed or fitted by an additional spline. Since the logic of reasoning was inverted (not "find $\mu_0$ to find $\chi$," but "find $\chi$ to find $\mu_0$"), the method is suited for the search for peculiarities on the $\mu_0$ curve, not for structural XAFS research. Besides, the range of accuracy of the model appears to be unknown in principle: everything that is not described by the model is included in $\mu_0$; the errors of the background approximation are also undefined.

In the old work [13], the damping of the XAFS amplitude resulting from measurements with low resolution (with a large slit width) was used for the determination of the background absorption $\mu_0$. The superposition of two spectra measured with different energy resolutions gives intersection points, some of which belong to $\mu_0$. Then a smoothing spline is drawn through the nodal points obtained. As the authors of Ref. [13] noted, measurements of the spectra with worsened resolution are not necessary; the spectra can be damped by convolution with a "rocking curve" approximated by a Gaussian function. Of course, the method works correctly only with a small variation of the Gaussian width, since for a large width not only is the XAFS amplitude damped but the edge itself is washed out. Because of this, only the extended part of a spectrum can be reliably determined.

The damping of the XAFS amplitude can be due to other reasons. For instance, as was pointed out in Ref. [13], the nodal points may be obtained from a variable-temperature study. This idea was realized in Ref. [14] and is more sound, since the atomic-like background is really independent of temperature, and with temperature the XAFS amplitude changes, not the shape of the edge. For this, however, it is important that the phase difference between the XAFS at different temperatures be negligible, which is true only for low wave numbers. Unfortunately, the method is suitable only for some particular cases (to say nothing of the need for a measured temperature series of spectra). Op. cit. it was demonstrated for the x-ray-absorption data for the edge of solid Pb. In those spectra the first crossing of $\mu$ and $\mu_0$ occurs already at about 15 eV above the edge. In our sample spectra the first crossing occurs only at about 30 eV, which allows one to find at most 2-3 points, the first of them being situated at $k > 2.5$ Å$^{-1}$.

An interesting approach to the problem of $\mu_0$ determination was reported in Ref. [15]. It is based on the simple identity that relates the FT of a function to the FT of its $n$-th derivative:

$$\mathrm{FT}[f^{(n)}(k)] = (2ir)^n\, \mathrm{FT}[f(k)], \qquad (6)$$

where the conjugate variables are $k$ and $2r$. Since the atomic-like background is smooth enough, the higher derivatives $\mu^{(n)}(k)$ ($n \ge 2$) oscillate near zero. Performing the FT of $\mu^{(n)}(k)\cdot k^w$ and using Eq. (6), one obtains the FT of the unnormalized would-be $\chi(k)\cdot k^w$ (see Fig. 6). Op. cit. the low-$r$ part (which in our example is $0 < r < 1.1$ Å) was cut off, and then the back FT was done. As a result, one has the unnormalized $\chi(k)\cdot k^w$ and, having subtracted it from $\mu(k)$, the atomic-like background, on which some peculiarities due to multi-electron excitations can be distinguished. Like the method of Ref. [12], this method is suited for the search for peculiarities on the $\mu_0$ curve, not for structural XAFS research, because of the evident distortion of the first peak of the FT by the contribution from the atomic-like background. To illustrate this assertion, in Fig. 6 we show the FT of the second derivative of the $\mu_0(k)$ found by the present method. As seen, this contribution is not that small.
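The derivative identity can be checked numerically on a test function. The sketch below assumes the transform kernel $e^{-2ikr}$ (the sign convention is not stated in the text; with the opposite sign the factor becomes $(-2ir)^n$) and uses a Gaussian whose second derivative is known in closed form. All names are hypothetical.

```python
import cmath
import math


def ft(func, r, k_lim=8.0, steps=4000):
    """Midpoint-rule estimate of the integral of func(k) * exp(-2i k r) over [-k_lim, k_lim]."""
    h = 2.0 * k_lim / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        k = -k_lim + (i + 0.5) * h
        total += func(k) * cmath.exp(-2j * k * r)
    return total * h


def gauss(k):
    """Test function f(k) = exp(-k^2)."""
    return math.exp(-k * k)


def gauss_second_derivative(k):
    """Analytic second derivative of exp(-k^2)."""
    return (4.0 * k * k - 2.0) * math.exp(-k * k)
```

With these definitions `ft(gauss_second_derivative, r)` agrees with `(2j * r) ** 2 * ft(gauss, r)` to rounding accuracy, which is the $n = 2$ case of the identity.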

If the electronic states of an absorbing atom in the gaseous phase and in the compound of interest may be considered equivalent, $\mu_0$ can be set equal to the measured absorption in the gas, as was done in Ref. [16] for solid, liquid, and gaseous Kr. Some differences in the energy positions and relative weights of the double-electron excitation channels were taken into account by a model using simple empirical functions, which were then transferred to the spectra of liquid and solid Kr. Notice that the prior function proposed in the present paper for the methods of smoothing spline and bayesian smoothing can include additional terms corresponding to the multi-electron contributions.



III. ERRORS IN $\mu_0$ CONSTRUCTION, NOISE, AND CHOICE OF LIMITS $k_{\min}$, $k_{\max}$

Why do we need to know the errors of the XAFS-function extraction? First, without knowing these values one cannot, in principle, aim at their minimization. Second, they are used in the definition of the $\chi^2$-statistics in fitting problems; their underestimation is a source of unjustifiably optimistic errors of the fitting parameters. Third, along with the analysis of the noise, the errors of the $\mu_0$ construction allow us to choose the limits of the reliable EXAFS signal, $k_{\min}$ and $k_{\max}$.

Unfortunately, the issue of the quality of XAFS extraction from the measured absorption has not been addressed properly. We see several reasons for that. On the one hand, without a correctly developed approach to the estimation of the errors of the final results (interatomic distances, Debye-Waller factors, etc. found via fitting), the errors of EXAFS extraction are of little use. On the other hand, only a few methods include approaches to their estimation.

One can easily compare the errors of different methods (see Fig. 4) and then choose the most reliable one. The problem of plausible limitations on the absolute value of the errors is more difficult. Define the "signal" as the envelope of $\chi(k)$ (solid line with gray filling in Fig. 4). It is quite reasonable to demand that the errors of the $\mu_0$ construction be less than the XAFS signal. For the method of the interpolation spline drawn through the varied knots, meeting this requirement leads to a restriction on the photoelectron wave numbers: $2 < k < 14$ Å$^{-1}$. For the bayesian curve $a$ this range is $0 < k < 14$ Å$^{-1}$; for the bayesian curves $b$ and $c$ it is wider: $0 < k < 16$ Å$^{-1}$.
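The requirement that the extraction error stay below the signal envelope can be turned into a range of usable wave numbers as sketched below. Choosing the longest contiguous run of acceptable points is an assumption of this sketch, and the function name is hypothetical.

```python
def reliable_range(k, error, signal):
    """Ends of the longest contiguous run of k points where error < signal, or None."""
    best, start = None, None
    for i in range(len(k) + 1):
        ok = i < len(k) and error[i] < signal[i]
        if ok and start is None:
            start = i  # a run of acceptable points begins
        elif not ok and start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)  # half-open index interval of the longest run so far
            start = None
    return None if best is None else (k[best[0]], k[best[1] - 1])
```

Applied to the error curves and the envelope of Fig. 4, a function of this kind would return the limits quoted above for each method.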

Another factor that limits the spectrum length is the presence of noise. Determining the noise is a straightforward task in $r$-space, where the XAFS signal at high $r$ clearly has the character of noise. By Parseval's identity the noise in $r$-space is related to the noise in $k$-space [17]:

$$\int_{k_{\min}}^{k_{\max}} |n_k k^w|^2\, dk = 2\int_0^{\pi/(2\,dk)} |n_r|^2\, dr. \qquad (7)$$

Substitute the mean value of the FT magnitude squared over the range $15 < r < 25$ Å for $|n_r|^2$. Then

$$n_k^2 = \langle |n_r|^2 \rangle\, \frac{\pi}{dk}\, \frac{2w+1}{k_{\max}^{2w+1} - k_{\min}^{2w+1}}. \qquad (8)$$

As seen from the formula, $n_k$ depends on $dk$, the step of the evenly spaced $k$-grid. Although we have already used the Fourier transform above, the question of the choice of $dk$ has not been raised yet. The fast FT algorithm needs the transformed function to be set on a uniform grid. Having chosen a small $dk$, we artificially obtain a large number of "experimental" values. Naturally, this trick would not give more information than we have, and the errors $n_k$ must be large at small $dk$. In our example the choice of $dk$ (0.03 Å$^{-1}$) was based on the equality of the numbers of experimental points and grid nodes. The signal-to-noise ratio obtained is greater than unity over the whole spectrum (see Fig. 4). There was no doubt about that: the signal is visually distinguishable even at the very far end of the spectrum (see Fig. 1(b) and Fig. 5(b)).
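The conversion of the $r$-space noise into $k$-space noise can be sketched as follows, assuming white noise in both spaces so that Parseval's identity reduces to $n_k^2 = \langle|n_r|^2\rangle\,(\pi/dk)\,(2w+1)/(k_{\max}^{2w+1} - k_{\min}^{2w+1})$. The function and argument names are hypothetical.

```python
import math


def noise_in_k(nr2_mean, dk, k_min, k_max, w):
    """k-space noise from the mean squared FT magnitude in the high-r (noise) region."""
    span = (k_max ** (2 * w + 1) - k_min ** (2 * w + 1)) / (2 * w + 1)
    return math.sqrt(nr2_mean * math.pi / (dk * span))
```

Halving `dk` raises the estimate by a factor of $\sqrt{2}$, which reflects the remark that a finer grid adds no information.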

The noise can also be estimated from bayesian considerations [18]. Suppose that after the measurements we have the counts from solid-state or gas-filled detectors, and let there be a positive real number $\lambda$ such that the probability that a single count occurs in the time interval $dt$ is

$$P(1|\lambda) = \lambda\, dt. \qquad (9)$$

It can be shown [19] that merely from this assumption it follows that the counts obey the Poisson distribution law:

$$P(N|\lambda, T) = \frac{(\lambda T)^N e^{-\lambda T}}{N!}, \qquad (10)$$

where $T$ is the sampling time. The problem is to find the intensity $\lambda$ and its variance. Using Bayes' theorem and introducing the prior probabilities $P(N) = 1/N$ and $P(\lambda) = 1/\lambda$ [20], one obtains:

$$P(\lambda|N, T) = \frac{P(N|\lambda, T)\,P(\lambda)}{P(N)} = \frac{T(\lambda T)^{N-1} e^{-\lambda T}}{(N-1)!}, \qquad (11)$$

that is, after the measurement the variate $2T\lambda$ follows the $\chi^2$-distribution with $2N$ degrees of freedom. It is easy to find that $\bar\lambda = N/T$, $\overline{\lambda^2} = N(N+1)/T^2$, and the standard deviation of the intensity is $\delta\lambda = \sqrt{N}/T$.
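The quoted mean and spread of the intensity can be verified by direct numerical integration of the posterior $T(\lambda T)^{N-1}e^{-\lambda T}/(N-1)!$. The sketch below uses a plain midpoint rule; the function names, the integration limit, and the number of steps are choices of this illustration.

```python
import math


def posterior_pdf(lam, n, t):
    """Posterior density of the intensity lam after n counts in time t (via logarithms)."""
    log_p = math.log(t) + (n - 1) * math.log(lam * t) - lam * t - math.lgamma(n)
    return math.exp(log_p)


def moments(n, t, steps=20000):
    """Mean and standard deviation of lam by midpoint integration of the posterior."""
    upper = (n + 12.0 * math.sqrt(n) + 12.0) / t  # far enough into the tail
    h = upper / steps
    m0 = m1 = m2 = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        p = posterior_pdf(lam, n, t) * h
        m0 += p
        m1 += lam * p
        m2 += lam * lam * p
    mean = m1 / m0
    return mean, math.sqrt(m2 / m0 - mean * mean)
```

For $N = 100$ counts in $T = 2$ s this gives a mean close to $N/T = 50$ and a standard deviation close to $\sqrt{N}/T = 5$.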

Denote the counts from the detectors measuring $i_0$ and $i_1$ as $I_0$ and $I_1$. By definition the variate $\xi = (i_0/I_0)/(i_1/I_1)$ follows Fisher's $F$-distribution with $(2I_0, 2I_1)$ degrees of freedom. Its expected value and variance are known: $\bar\xi = I_1/(I_1 - 1)$, $\delta^2\xi = I_1^2(I_0 + I_1 - 1)/[(I_1 - 1)^2(I_1 - 2)I_0]$, from which we find for the absorption in the fluorescence mode ($\mu x = i_0/i_1$):

$$\overline{i_0/i_1} = \frac{I_0}{I_1 - 1}, \qquad \delta^2(i_0/i_1) = \frac{I_0(I_0 + I_1 - 1)}{(I_1 - 1)^2(I_1 - 2)}. \qquad (12)$$

Further, the variate $\eta = \tfrac{1}{2}\ln\xi$ follows the $z$-distribution (Fisher's distribution of the variance ratio) with $(2I_0, 2I_1)$ degrees of freedom. Its expected value and variance are known: $\bar\eta = 0$, $\delta^2\eta = (I_0 + I_1)/(4I_0I_1)$, from which we find for the absorption in the transmission mode ($\mu x = \ln(i_0/i_1)$):

$$\overline{\ln(i_0/i_1)} = \ln\frac{I_0}{I_1}, \qquad \delta^2\ln(i_0/i_1) = \frac{1}{I_0} + \frac{1}{I_1}. \qquad (13)$$

The noise of the XAFS function is

$$n_P = \frac{\delta\mu}{\mu_0 - \mu_b} = \left(\frac{1}{I_0} + \frac{1}{I_1}\right)^{1/2} \frac{1}{\mu_0 - \mu_b}. \qquad (14)$$

In our example this noise becomes greater than the signal at $k > 15$ Å$^{-1}$ (see Fig. 4). What is the reason for such a significant difference between the noise really present and its statistical estimate $n_P$? Of course, the reason lies in the false premise (9). In practice this condition is realized as $P(c|\lambda) = \lambda\,dt$. For example, the photocurrent in an ion chamber depends on the gas pressure, the potential applied, etc.; these dependencies are contained in $c$. In other words, the amplification path works in such a way that one photon gives birth to $c$ counts. There is no difficulty in writing the posterior distribution for the generalized premise:

$$P(\lambda|N, T) = \frac{T(\lambda T)^{N/c - 1} e^{-\lambda T}}{\Gamma(N/c)}, \qquad (15)$$



with $\bar\lambda = N/(cT)$ and $\delta\lambda = \sqrt{N/c}/T$. Thus, having an unknown $c$ (and implicitly assigning $c = 1$), we got wrong variances for $i_0/i_1$ and $\ln(i_0/i_1)$. Unfortunately, in most real experiments the association between the probability of a single count event and the radiation intensity (via $c$) is unknown. In spite of this, Poisson counting statistics has traditionally been used for a long time. For example, in Ref. [21] signal-to-noise ratios are evaluated (assuming $c = 1$) for different detection schemes.
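The counting-statistics estimate of the noise in transmission mode, and its dependence on the unknown gain, can be sketched as follows. The extension to separate gains `c0` and `c1` for the two detectors is an assumption of this sketch (the text speaks of a single $c$): a detector that produces $c$ counts per photon has registered $I/c$ photons, so each term $1/I$ becomes $c/I$. The function name is hypothetical.

```python
import math


def poisson_noise(i0, i1, edge_step, c0=1.0, c1=1.0):
    """Noise of chi from counting statistics; edge_step is mu0 - mub in the units of ln(i0/i1)."""
    # Relative variance of each intensity is (counts per photon) / counts
    return math.sqrt(c0 / i0 + c1 / i1) / edge_step
```

With the default `c0 = c1 = 1` this is the usual Poisson estimate; a gain of 4 on both detectors doubles it, which shows how strongly the estimate depends on a quantity that is usually unknown.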

Practically all programs for XAFS spectra processing [10] use Fourier analysis to estimate the noise. But then it is the noise that they use as the uncertainties $\varepsilon_i$ of the $\chi(k)$ determination in the definition of the $\chi^2$-statistics:

$$\chi^2 = \sum_i \frac{[(\chi_{\rm exp})_i - (\chi_{\rm mod})_i]^2}{\varepsilon_i^2}. \qquad (16)$$

It would be more correct to take as $\varepsilon_i$ the larger of the two: the noise and the errors of the construction of $\mu_0$. In our case (and as a rule) the latter are substantially greater than the noise (especially in method II B). In the following paper [2] we shall show how understated $\varepsilon_i$ lead to optimistic errors of structural parameters.
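The suggestion to use the larger of the two uncertainties can be sketched as below. A single noise level for all points and a per-point background error are assumptions of this sketch, as is the function name.

```python
def chi_square(chi_exp, chi_mod, noise, bkg_error):
    """Chi-square with eps_i = max(noise, background error at point i)."""
    total = 0.0
    for x_exp, x_mod, bkg in zip(chi_exp, chi_mod, bkg_error):
        eps = max(noise, bkg)
        total += ((x_exp - x_mod) / eps) ** 2
    return total
```

Using only `noise` in the denominator would give a larger chi-square and, after fitting, narrower confidence intervals than the data support.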



IV. XAFS-FUNCTION CORRECTION 



For one reason or another the experimental XAFS might be distorted. Consider some of the reasons.

(i) Let the counts ($I$) from the detectors be associated with the intensities ($i$) as $i_0 = \kappa_0 I_0$ and $i_1 = \kappa_1 I_1$. Then the absorption (in the transmission mode) equals

$$\mu x = \ln(i_0/i_1) = \ln(I_0/I_1) + \ln(\kappa_0/\kappa_1). \qquad (17)$$

The second term is a slowly varying function of energy and can be taken into account in independent experiments. Such a distortion appears, for instance, if the absorptance of the gas in ion-chamber detectors depends on energy.

(ii) If some part of the incident radiation is not attenuated in the sample as much as expected (due to pinholes in the sample, harmonics in the incoming beam, etc.), that is, $i_0 = \kappa_0 I_0 + b$, then the real absorption is connected with the measured $I_0$ and $I_1$ in a complicated way. In Ref. [22] the possible decrease of the XAFS amplitude was shown to be substantial even at low $b/(\kappa_0 I_0)$ for thick samples. At a known ratio $b/(\kappa_0 I_0)$, the correcting factor can be easily obtained.

(iii) In the fluorescence mode, due to absorption of the fluorescent signal in the sample itself, XAFS spectra strongly depend on the detection geometry. In Ref. [23] the correcting functions are found explicitly.

(iv) The problem of glitches is widespread in XAFS analysis. The glitches are due to multiple Bragg reflections being satisfied simultaneously and, for each given monochromator, appear at strictly determined spectral positions. In most cases the glitches seen on the curves $I_0(E)$ and $I_1(E)$ vanish in the $I_0/I_1$ ratio. If not, one can easily get rid of them. For instance, the glitch area, usually extremely narrow, is smoothed or, with fixed ends, replaced by a straight-line segment. The main thing in the correct analysis of glitches is their detection.

FIG. 7. Energy dependence of experimental counts from ion chambers. Curve $I_0(E)$ relates to the left axis, $I_1(E)$ to the right one. In glitch areas the absolute value of the derivative is greater than the critical level specified. On $\chi(k)\cdot k^w$ only the glitch $b$ is manifested. The displaced fragment is $\chi(k)\cdot k^w$ after correction.

To detect a glitch on the curves $\mu$ or $\chi k^w$ is practically impossible. For this, one needs the primary data $I_0(E)$ and
$I_1(E)$, not $\ln(I_0/I_1)$ nor $I_0/I_1$. Outside the glitches the intensity of the incident radiation depends on energy smoothly, apart from the noise
(see Fig. 7). This suggests detecting glitches via a critical level for the derivative, $|d\ln I_0/dE|_c$.
For the $I_0(E)$ curve presented in Fig. 7, the absolute value of the derivative in the glitch areas is greater than the
critical value, chosen to be equal to $1.77\cdot 10^{-\ldots}$. Having extracted the XAFS, one can see that the first (paired) glitch
a is not manifested on $\chi(k)\cdot k^w$, the last two (c and d) are obscured by the noise, and only glitch b is clearly pronounced.
Now, being confident that this is not a part of the XAFS, one can eliminate the glitch with ease. Here, we
fixed its ends on $\mu(E)$, replaced it by a straight-line segment, and constructed $\chi(k)\cdot k^w$ again.
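
The detection and correction steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `glitch_mask` and `patch_segment`, the synthetic counts and the threshold of 0.01 eV$^{-1}$ are hypothetical and are not taken from the measurements of Fig. 7. The derivative of $\ln I_0$ is estimated by one-sided differences, and both ends of every interval exceeding the critical level are flagged.

```python
import numpy as np

def glitch_mask(energy, i0, critical):
    """Flag points adjacent to intervals where |d ln I0 / dE| exceeds `critical`."""
    slope = np.diff(np.log(i0)) / np.diff(energy)   # one value per interval
    steep = np.abs(slope) > critical
    mask = np.zeros(i0.size, dtype=bool)
    mask[:-1] |= steep      # left end of each steep interval
    mask[1:] |= steep       # right end of each steep interval
    return mask

def patch_segment(energy, mu, first, last):
    """Replace mu[first:last+1] by a straight line between its fixed end points."""
    patched = mu.copy()
    patched[first:last + 1] = np.interp(
        energy[first:last + 1],
        [energy[first], energy[last]],
        [mu[first], mu[last]],
    )
    return patched

# Synthetic data, assumed for illustration only
energy = np.linspace(9000.0, 9100.0, 201)            # eV
mu_true = 1.0 - 0.002 * (energy - 9000.0)            # smooth absorption
i0 = 1.0e6 * np.exp(-1.0e-4 * (energy - 9000.0))     # smooth incident counts
i1 = i0 * np.exp(-mu_true)                           # transmitted counts
i0[100] *= 0.9                                       # glitch that does not cancel in I0/I1
mu = np.log(i0 / i1)

mask = glitch_mask(energy, i0, critical=0.01)        # threshold in 1/eV, assumed
idx = np.flatnonzero(mask)
mu_fixed = patch_segment(energy, mu, idx[0], idx[-1])
```

With several glitches in one spectrum, the flagged points would have to be grouped into separate runs before patching; the sketch treats a single glitch only.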

V. CONCLUSION 

In this paper we have considered all stages of the XAFS function extraction from the measured absorption. We focused
our attention on the most important stage, the construction of the atomic-like absorption $\mu_0$.

For the widespread method of approximating $\mu_0$ by a smoothing spline we have proposed a way to raise the
stability by including prior information about the absorption edge shape ("nearly step" or "nearly step with a white
line"). Besides, we have proposed a new reliable criterion for the determination of the smoothing regularizer.

A new method for the approximation of $\mu_0$ is proposed, the method of bayesian smoothing. It can include various prior
information, which raises the accuracy of the XAFS determination. With this method one finds the distribution of
$\mu_0$ at each experimental point, from which one can find not only average values but also any desired moments, which
is an additional difficulty for other methods. This method was shown to give a more accurate atomic-like
background than that obtained by the method of Ref. [7].

Particular attention has been given to the analysis of noise. We have discussed the difficulties of estimating it on the
basis of a statistical approach. More reliable is the determination of noise from the Fourier transform. We have shown
that the experimental noise is essentially less than the errors of the $\mu_0$ construction, and that the use of the noise values in the
definition of the $\chi^2$-statistics is erroneous, since it leads to unjustifiably optimistic errors of the structural parameters
inferred in fitting procedures. For a detailed consideration of the accuracy of fitting parameters see the following paper
[2].
APPENDIX: BAYESIAN SMOOTHING AND DECONVOLUTION



1. Posterior distribution for smoothed data 

Consider the general linear problem of data smoothing with the use of statistical methods (for an introduction see the review
by Turchin et al. [24] and the articles from the Web site bayes.wustl.edu). Let the data $\mathbf{d}$ be defined on the mesh $x_1,\ldots,x_N$
and consist of the true values $\mathbf{t}$ and the additive noise $\mathbf{n}$:

$$ d_i = t_i + n_i, \qquad i = 1,\ldots,N. \tag{A1} $$

The problem of smoothing is to find the best estimates of $\mathbf{t}$. For an arbitrary node $j$, find the probability density
function for $t_j$ given the data $\mathbf{d}$:

$$ P(t_j|\mathbf{d}) = \int\!\cdots dt_{i\ne j}\cdots\; P(\mathbf{t}|\mathbf{d}), \tag{A2} $$

where $P(\mathbf{t}|\mathbf{d})$ is the joint probability density function for all values $\mathbf{t}$, and the integration is done over all $t_{i\ne j}$.
According to Bayes theorem,

$$ P(\mathbf{t}|\mathbf{d}) = \frac{P(\mathbf{d}|\mathbf{t})\,P(\mathbf{t})}{P(\mathbf{d})}, \tag{A3} $$

$P(\mathbf{t})$ being the joint prior probability for all $t_i$, and $P(\mathbf{d})$ a normalization constant. Assuming that the values $n_i$ are
independent in different nodes and normally distributed with zero expected values, the probability $P(\mathbf{d}|\mathbf{t})$, the so-called
likelihood function, is given by

$$ P(\mathbf{d}|\mathbf{t},\sigma) = (2\pi\sigma^2)^{-N/2}\exp\Big(-\frac{1}{2\sigma^2}\sum_{k=1}^{N}(d_k-t_k)^2\Big), \tag{A4} $$

where the standard deviation of the noise, $\sigma$, appears as a known value. Later, we apply the rules of probability
theory to remove $\sigma$ from the problem.

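Now define the prior probability $P(\mathbf{t})$. Suppose we know in advance that the function $t(x)$ is smooth enough. To specify
this information, introduce the norm of the second derivative and indicate its expected approximate value:

$$ \Omega(t(x)) = \int\Big(\frac{d^2t}{dx^2}\Big)^2 dx \approx \omega. \tag{A5} $$

Denote $\Delta_i = x_{i+1} - x_i$, $i = 1,\ldots,N-1$, and represent the second derivative in the finite-difference form:

$$ \Omega(\mathbf{t}) = \sum_{i=2}^{N-1}\big[t_{i-1}\Delta_{i-1}^{-1} - t_i(\Delta_{i-1}^{-1}+\Delta_i^{-1}) + t_{i+1}\Delta_i^{-1}\big]^2\,\Delta_i^{-1} = \sum_{k,l=1}^{N}\Omega_{kl}t_k t_l. \tag{A6} $$

$\Omega_{kl}$ is a five-diagonal symmetric matrix with the following non-zero elements:

$$
\begin{aligned}
\Omega_{11} &= \Delta_1^{-2}\Delta_2^{-1}, \qquad
\Omega_{22} = \Delta_2^{-1}(\Delta_1^{-1}+\Delta_2^{-1})^2 + \Delta_2^{-2}\Delta_3^{-1}, \qquad
\Omega_{12} = -(\Delta_1\Delta_2)^{-1}(\Delta_1^{-1}+\Delta_2^{-1}), \\
\Omega_{ii} &= \Delta_{i-1}^{-3} + \Delta_i^{-1}(\Delta_{i-1}^{-1}+\Delta_i^{-1})^2 + \Delta_i^{-2}\Delta_{i+1}^{-1}, \qquad i = 3,\ldots,N-2, \\
\Omega_{i-1,i} &= -\Delta_{i-1}^{-2}(\Delta_{i-2}^{-1}+\Delta_{i-1}^{-1}) - (\Delta_{i-1}\Delta_i)^{-1}(\Delta_{i-1}^{-1}+\Delta_i^{-1}), \qquad i = 3,\ldots,N-1, \\
\Omega_{i,i+2} &= \Delta_i^{-1}\Delta_{i+1}^{-2}, \qquad i = 1,\ldots,N-2, \\
\Omega_{NN} &= \Delta_{N-1}^{-3}, \qquad
\Omega_{N-1,N-1} = \Delta_{N-1}^{-1}(\Delta_{N-1}^{-1}+\Delta_{N-2}^{-1})^2 + \Delta_{N-2}^{-3}, \qquad
\Omega_{N-1,N} = -\Delta_{N-1}^{-2}(\Delta_{N-1}^{-1}+\Delta_{N-2}^{-1}).
\end{aligned} \tag{A7}
$$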
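
The matrix elements above need not be typed in one by one. A minimal sketch, assuming NumPy and a hypothetical helper name `curvature_matrix`, builds one row of finite-difference coefficients per interior node (the bracket of (A6) multiplied by $\Delta_i^{-1/2}$) and forms $\Omega$ as the product of that coefficient matrix with its transpose, so that the quadratic form $\sum_{k,l}\Omega_{kl}t_k t_l$ is reproduced by construction.

```python
import numpy as np

def curvature_matrix(x):
    """Five-diagonal matrix Omega such that t @ Omega @ t equals the sum in (A6)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    inv = 1.0 / np.diff(x)                   # inv[m] = 1/(x[m+1] - x[m])
    b = np.zeros((n - 2, n))                 # one row per interior node
    m = np.arange(1, n - 1)                  # 0-based indices of interior nodes
    w = np.sqrt(inv[m])                      # square root of the weight 1/Delta_i
    b[m - 1, m - 1] = w * inv[m - 1]         # coefficient of t_{i-1}
    b[m - 1, m] = -w * (inv[m - 1] + inv[m]) # coefficient of t_i
    b[m - 1, m + 1] = w * inv[m]             # coefficient of t_{i+1}
    return b.T @ b

# Non-uniform mesh with Delta = 1, 2, 3, 4 (illustrative values)
omega = curvature_matrix([0.0, 1.0, 3.0, 6.0, 10.0])
```

For this mesh the corner elements can be checked by hand against (A7): $\Omega_{11} = \Delta_1^{-2}\Delta_2^{-1} = 1/2$ and $\Omega_{NN} = \Delta_{N-1}^{-3} = 1/64$.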



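In order to introduce the minimum information in addition to that contained in (A6), from all functions $P(\mathbf{t})$ that are normalized to unity
and satisfy the condition (A6) we choose the single one that contains minimum information about $\mathbf{t}$,
that is, minimizes the functional

$$ I[P(\mathbf{t})] = \int P(\mathbf{t})\ln P(\mathbf{t})\,d\mathbf{t} + \beta\Big[1 - \int P(\mathbf{t})\,d\mathbf{t}\Big] + \gamma\Big[\omega - \int\Omega(\mathbf{t})P(\mathbf{t})\,d\mathbf{t}\Big], \tag{A8} $$

where $\beta$ and $\gamma$ are the Lagrange multipliers. Minimizing $I[P(\mathbf{t})]$, one obtains the equation set

$$
\begin{aligned}
&\ln P(\mathbf{t}) + 1 - \beta - \gamma\,\Omega(\mathbf{t}) = 0, \\
&\int P(\mathbf{t})\,d\mathbf{t} = 1, \\
&\int\Omega(\mathbf{t})P(\mathbf{t})\,d\mathbf{t} = \omega,
\end{aligned} \tag{A9}
$$

that has the solution

$$ P(\mathbf{t}) = (\lambda_1\cdots\lambda_N)^{1/2}\Big(\frac{\alpha}{2\pi\sigma^2}\Big)^{N/2}\exp\Big(-\frac{\alpha}{2\sigma^2}\,\Omega(\mathbf{t})\Big), \tag{A10} $$

where $\alpha/2\sigma^2 = -\gamma = N/2\omega$, and $\lambda_1,\ldots,\lambda_N$ are the eigenvalues of the matrix $\Omega_{kl}$. The regularizer $\alpha$ will be used
to control the smoothness of $\mathbf{t}$. The prior distribution obtained is a "soft" one, that is, it does not require the
solution to have a strictly prescribed form.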

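Thus, we have for the probability density function:

$$
\begin{aligned}
P(t_j|\mathbf{d},\sigma,\alpha) &\propto \int\!\cdots dt_{i\ne j}\cdots\;\sigma^{-2N}\alpha^{N/2}
\exp\Big(-\frac{\alpha}{2\sigma^2}\sum_{k,l=1}^{N}\Omega_{kl}t_k t_l\Big)
\exp\Big(-\frac{1}{2\sigma^2}\sum_{k=1}^{N}(d_k-t_k)^2\Big) \\
&= \int\!\cdots dt_{i\ne j}\cdots\;\sigma^{-2N}\alpha^{N/2}
\exp\Big(-\frac{1}{2\sigma^2}\Big[\mathbf{d}^2 - 2\sum_{k=1}^{N}d_k t_k + \sum_{k,l=1}^{N}g_{kl}t_k t_l\Big]\Big),
\end{aligned} \tag{A11}
$$

where

$$ g_{kl} = \alpha\,\Omega_{kl} + \delta_{kl}, \qquad \mathbf{d}^2 = \sum_{k=1}^{N}d_k^2. \tag{A12} $$

Since there is no integral over $t_j$, separate it from the other integration variables:

$$
\begin{aligned}
P(t_j|\mathbf{d},\sigma,\alpha) \propto{}& \sigma^{-2N}\alpha^{N/2}\exp\Big(-\frac{1}{2\sigma^2}\big[\mathbf{d}^2 - 2d_j t_j + g_{jj}t_j^2\big]\Big) \\
&\times\int\!\cdots dt_{i\ne j}\cdots\;\exp\Big(-\frac{1}{2\sigma^2}\Big[\sum_{k,l\ne j}g_{kl}t_k t_l - 2\sum_{k\ne j}\big[d_k - g_{kj}t_j\big]t_k\Big]\Big).
\end{aligned} \tag{A13}
$$

Here the sums over $k,l\ne j$ omit the $j$-th item. Further, find the eigenvalues $\lambda'_i$
and the corresponding eigenvectors $\mathbf{e}_i$ of the matrix $g_{kl}$ in which the $j$-th row and column are deleted, and change the
variables:

$$ b_i = \sqrt{\lambda'_i}\sum_{k\ne j}t_k e_{ik}, \qquad t_k = \sum_{i\ne j}\frac{b_i e_{ik}}{\sqrt{\lambda'_i}}. \tag{A14} $$

Using the properties of eigenvectors:

$$ \sum_{k\ne j}g_{lk}e_{ik} = \lambda'_i e_{il}, \qquad \sum_{k\ne j}e_{ik}e_{lk} = \delta_{il} \qquad (l, i \ne j), \tag{A15} $$

one obtains:

$$
\begin{aligned}
P(t_j|\mathbf{d},\sigma,\alpha) \propto{}& \sigma^{-2N}\alpha^{N/2}\exp\Big(-\frac{1}{2\sigma^2}\big[(\mathbf{d}^2-\mathbf{h}^2) - 2t_j(d_j-\mathbf{hu}) + t_j^2(g_{jj}-\mathbf{u}^2)\big]\Big) \\
&\times\int\!\cdots db_{i\ne j}\cdots\;\exp\Big(-\frac{1}{2\sigma^2}\sum_{i\ne j}\big[b_i - h_i + u_i t_j\big]^2\Big),
\end{aligned} \tag{A16}
$$

where new quantities were introduced:

$$
\begin{aligned}
&h_i = \frac{1}{\sqrt{\lambda'_i}}\sum_{k\ne j}d_k e_{ik}, \qquad u_i = \frac{1}{\sqrt{\lambda'_i}}\sum_{k\ne j}g_{kj}e_{ik}, \\
&\mathbf{h}^2 = \sum_{i\ne j}h_i^2, \qquad \mathbf{u}^2 = \sum_{i\ne j}u_i^2, \qquad \mathbf{hu} = \sum_{i\ne j}h_i u_i.
\end{aligned} \tag{A17}
$$

Evaluating the $N-1$ integrals in (A16), one finally obtains the posterior probability for the $j$-th node:

$$ P(t_j|\mathbf{d},\sigma,\alpha) \propto \sigma^{-(N+1)}\alpha^{N/2}\exp\Big(-\frac{1}{2\sigma^2}\big[(\mathbf{d}^2-\mathbf{h}^2) - 2t_j(d_j-\mathbf{hu}) + t_j^2(g_{jj}-\mathbf{u}^2)\big]\Big). \tag{A18} $$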



2. Eliminating nuisance parameters 

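In most real problems $\sigma$ and $\alpha$ are not known. Eliminating $\sigma$ is quite straightforward:

$$ P(t_j|\mathbf{d},\alpha) = \int d\sigma\,P(t_j,\sigma|\mathbf{d},\alpha) = \int d\sigma\,P(\sigma)\,P(t_j|\mathbf{d},\sigma,\alpha); \tag{A19} $$

one needs only to know the prior probability $P(\sigma)$. Having no specific information about $\sigma$, a Jeffreys prior $P(\sigma) = 1/\sigma$
is assigned [20]. Then

$$
\begin{aligned}
P(t_j|\mathbf{d},\alpha) &\propto \int d\sigma\,\sigma^{-(N+2)}\exp\Big(-\frac{1}{2\sigma^2}\big[(\mathbf{d}^2-\mathbf{h}^2) - 2t_j(d_j-\mathbf{hu}) + t_j^2(g_{jj}-\mathbf{u}^2)\big]\Big) \\
&\propto \big[(\mathbf{d}^2-\mathbf{h}^2) - 2t_j(d_j-\mathbf{hu}) + t_j^2(g_{jj}-\mathbf{u}^2)\big]^{-(N+1)/2}.
\end{aligned} \tag{A20}
$$

Introducing the substitution

$$ s^2 = \frac{N\,(g_{jj}-\mathbf{u}^2)^2}{(\mathbf{d}^2-\mathbf{h}^2)(g_{jj}-\mathbf{u}^2) - (d_j-\mathbf{hu})^2}\Big(t_j - \frac{d_j-\mathbf{hu}}{g_{jj}-\mathbf{u}^2}\Big)^2, \tag{A21} $$

one obtains the Student $t$-distribution with $N$ degrees of freedom:

$$ P(s|\mathbf{d},\alpha) \propto \Big(1 + \frac{s^2}{N}\Big)^{-(N+1)/2}, \tag{A22} $$

with zero average and the variance $N/(N-2)$. From this one finds for $t_j$:

$$ \bar{t}_j = \frac{d_j-\mathbf{hu}}{g_{jj}-\mathbf{u}^2}, \qquad \delta^2(t_j) = \frac{(\mathbf{d}^2-\mathbf{h}^2)(g_{jj}-\mathbf{u}^2) - (d_j-\mathbf{hu})^2}{(g_{jj}-\mathbf{u}^2)^2}\,\frac{1}{N-2}. \tag{A23} $$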



Thus, we have got rid of the unknown $\sigma$ and found the expressions for the mean values $\bar{t}_j$ and their variances at a known
regularizer $\alpha$. Eliminating the latter is more difficult. The idea is to find not the smoothest solution, but the solution
of the most probable smoothness. For that we find the posterior probability:

$$ P(\alpha|\mathbf{d}) = \int d\mathbf{t}\,d\sigma\,P(\alpha,\sigma,\mathbf{t}|\mathbf{d}) = \int d\mathbf{t}\,d\sigma\,P(\alpha,\sigma)\,P(\mathbf{t}|\alpha,\sigma,\mathbf{d}). \tag{A24} $$

Assuming that $\alpha$ and $\sigma$ are independent and using Bayes theorem (A3), one obtains:

$$ P(\alpha|\mathbf{d}) \propto \int d\mathbf{t}\,d\sigma\,P(\alpha)\,P(\sigma)\,P(\mathbf{t}|\alpha,\sigma)\,P(\mathbf{d}|\mathbf{t},\sigma,\alpha). \tag{A25} $$

Substituting (A10) for the prior probability $P(\mathbf{t}|\alpha,\sigma)$, (A4) for the likelihood, and the Jeffreys priors $P(\sigma) = 1/\sigma$ and
$P(\alpha) = 1/\alpha$, one obtains the posterior distribution for the regularizer $\alpha$:
$$
\begin{aligned}
P(\alpha|\mathbf{d}) &\propto \int d\mathbf{t}\,d\sigma\,\sigma^{-2N-1}\alpha^{N/2-1}
\exp\Big(-\frac{\alpha}{2\sigma^2}\sum_{k,l=1}^{N}\Omega_{kl}t_k t_l\Big)
\exp\Big(-\frac{1}{2\sigma^2}\sum_{k=1}^{N}(d_k-t_k)^2\Big) \\
&= \int d\mathbf{t}\,d\sigma\,\sigma^{-2N-1}\alpha^{N/2-1}
\exp\Big(-\frac{1}{2\sigma^2}\Big[\mathbf{d}^2 - 2\sum_{k=1}^{N}d_k t_k + \sum_{k,l=1}^{N}g_{kl}t_k t_l\Big]\Big),
\end{aligned} \tag{A26}
$$

where the matrix $g_{kl}$ was defined in (A12). After its diagonalization, analogously to what was done above, one finally
obtains:

$$ P(\alpha|\mathbf{d}) \propto (\lambda'_1\cdots\lambda'_N)^{-1/2}\,\alpha^{N/2-1}\,\big[\mathbf{d}^2-\mathbf{h}^2\big]^{-N/2}, \tag{A27} $$

where $\mathbf{h}^2$ is given by

$$ \mathbf{h}^2 = \sum_{i=1}^{N}\Big(\frac{1}{\sqrt{\lambda'_i}}\sum_{k=1}^{N}d_k e_{ik}\Big)^2, \tag{A28} $$

and $\lambda'_i$ and $\mathbf{e}_i$ are now the eigenvalues and eigenvectors of the full matrix $g_{kl}$. Having found the maximum of the posterior probability
(A27), or having averaged the expression (A23) over it, one has the sought $\mathbf{t}$ with the most probable smoothness.
It is necessary to point out, however, that this procedure narrows the applicability of the bayesian smoothing to
the class of tasks where most of the smoothed values lie within $\pm\sigma$ of the most probable ones. In practice,
there may be other tasks, where the condition (A1) is treated more broadly and the smoothed values exceed the bounds
of noise.
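
The posterior (A27) is easy to evaluate numerically on a grid of $\alpha$. The following is a sketch under stated assumptions, not the implementation used in this paper: the helper names `curvature_matrix` and `log_posterior_alpha`, the synthetic data and the search interval for $\alpha$ are all hypothetical. The logarithm of (A27) is used to avoid overflow, and $\mathbf{d}^2-\mathbf{h}^2$ is computed as $\sum_i(\mathbf{e}_i\cdot\mathbf{d})^2(1-1/\lambda'_i)$, which is algebraically the same quantity.

```python
import numpy as np

def curvature_matrix(x):
    """Matrix Omega of the finite-difference quadratic form (A6)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    inv = 1.0 / np.diff(x)
    b = np.zeros((n - 2, n))
    m = np.arange(1, n - 1)
    w = np.sqrt(inv[m])
    b[m - 1, m - 1] = w * inv[m - 1]
    b[m - 1, m] = -w * (inv[m - 1] + inv[m])
    b[m - 1, m + 1] = w * inv[m]
    return b.T @ b

def log_posterior_alpha(alpha, d, omega):
    """ln P(alpha|d) from (A27), up to an additive constant."""
    n = d.size
    lam, vec = np.linalg.eigh(alpha * omega + np.eye(n))  # eigen-system of g, (A12)
    proj = vec.T @ d                                      # sum_k d_k e_ik
    resid = np.sum(proj**2 * (1.0 - 1.0 / lam))           # d^2 - h^2, h^2 as in (A28)
    return (-0.5 * np.sum(np.log(lam))
            + (0.5 * n - 1.0) * np.log(alpha)
            - 0.5 * n * np.log(resid))

# Synthetic noisy data, assumed for illustration only
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
d = np.sin(2.0 * np.pi * x) + 0.05 * rng.standard_normal(x.size)
omega = curvature_matrix(x)

alphas = np.logspace(-9, -3, 25)                          # search interval, assumed
logp = np.array([log_posterior_alpha(a, d, omega) for a in alphas])
alpha_best = alphas[np.argmax(logp)]                      # grid point of largest posterior
```

The interval over which the maximum is sought is a choice of the user; the grid search only picks the best point inside it.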



3. Expressions for smoothed values and their variances 

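The formulas (A23) are of little practical use, since they require finding the eigenvalues and eigenvectors of a matrix
of rank $N-1$ at each node. They have merely a methodological value: the explicit expressions for the
posterior probabilities enable one to find the average of an arbitrary function of $t_j$. However, $\bar{t}_j$ and $\delta^2(t_j)$ can be found
significantly more easily. Using (A19) and (A11), represent $\bar{t}_j$ as:

$$
\begin{aligned}
\bar{t}_j &= \int t_j\,d\sigma\,P(\sigma)\,P(t_j|\mathbf{d},\sigma,\alpha)\,dt_j \\
&\propto \int d\mathbf{t}\,d\sigma\,\sigma^{-2N-1}\,t_j\exp\Big(-\frac{1}{2\sigma^2}\Big[\mathbf{d}^2 - 2\sum_{k=1}^{N}d_k t_k + \sum_{k,l=1}^{N}g_{kl}t_k t_l\Big]\Big).
\end{aligned} \tag{A29}
$$

Performing the diagonalization of the full matrix $g_{kl}$, with $h_i = \lambda_i'^{-1/2}\sum_{k=1}^{N}d_k e_{ik}$ as in (A28), one obtains:

$$
\begin{aligned}
\bar{t}_j &\propto \int d\mathbf{b}\,d\sigma\,\sigma^{-2N-1}\exp\Big(-\frac{1}{2\sigma^2}\big[\mathbf{d}^2-\mathbf{h}^2\big]\Big)\Big(\sum_{i=1}^{N}\frac{b_i e_{ij}}{\sqrt{\lambda'_i}}\Big)\exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^{N}\big[b_i-h_i\big]^2\Big) \\
&\propto \sum_{i=1}^{N}\frac{e_{ij}h_i}{\sqrt{\lambda'_i}}\int d\sigma\,\sigma^{-N-1}\exp\Big(-\frac{1}{2\sigma^2}\big[\mathbf{d}^2-\mathbf{h}^2\big]\Big),
\end{aligned} \tag{A30}
$$

from where

$$ \bar{t}_j = \sum_{i=1}^{N}\frac{e_{ij}h_i}{\sqrt{\lambda'_i}}. \tag{A31} $$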



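Analogously, for the variance $\delta^2(t_j)$ one has:

$$
\begin{aligned}
\delta^2(t_j) &\propto \int d\mathbf{b}\,d\sigma\,\sigma^{-2N-1}\exp\Big(-\frac{1}{2\sigma^2}\big[\mathbf{d}^2-\mathbf{h}^2\big]\Big)\Big(\sum_{i=1}^{N}\frac{(b_i-h_i)e_{ij}}{\sqrt{\lambda'_i}}\Big)^2\exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^{N}\big[b_i-h_i\big]^2\Big) \\
&\propto \sum_{i=1}^{N}\frac{e_{ij}^2}{\lambda'_i}\int d\sigma\,\sigma^{-N-1}\exp\Big(-\frac{1}{2\sigma^2}\big[\mathbf{d}^2-\mathbf{h}^2\big]\Big)\,\sigma^2.
\end{aligned} \tag{A32}
$$

Normalizing, one finally obtains:

$$
\begin{aligned}
\delta^2(t_j) &= \sum_{i=1}^{N}\frac{e_{ij}^2}{\lambda'_i}\,
\frac{\int d\sigma\,\sigma^{-N+1}\exp\big(-[\mathbf{d}^2-\mathbf{h}^2]/2\sigma^2\big)}{\int d\sigma\,\sigma^{-N-1}\exp\big(-[\mathbf{d}^2-\mathbf{h}^2]/2\sigma^2\big)} \\
&= \sum_{i=1}^{N}\frac{e_{ij}^2}{\lambda'_i}\,
\frac{\Gamma(\tfrac{N}{2}-1)\,\big([\mathbf{d}^2-\mathbf{h}^2]/2\big)^{N/2}}{\big([\mathbf{d}^2-\mathbf{h}^2]/2\big)^{N/2-1}\,\Gamma(\tfrac{N}{2})}
= \frac{\mathbf{d}^2-\mathbf{h}^2}{N-2}\sum_{i=1}^{N}\frac{e_{ij}^2}{\lambda'_i}.
\end{aligned} \tag{A33}
$$

Now we have usable formulas, which require finding the eigenvalues and eigenvectors of the matrix of rank $N$ just
one time.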
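
Formulas (A31) and (A33) translate almost literally into code. The sketch below assumes NumPy, a known regularizer $\alpha$, and hypothetical helper names (`curvature_matrix`, `bayes_smooth`); the data are synthetic. Note that `numpy.linalg.eigh` returns the eigenvectors as columns, so $e_{ij}$ corresponds to `vec[j, i]`.

```python
import numpy as np

def curvature_matrix(x):
    """Matrix Omega of the finite-difference quadratic form (A6)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    inv = 1.0 / np.diff(x)
    b = np.zeros((n - 2, n))
    m = np.arange(1, n - 1)
    w = np.sqrt(inv[m])
    b[m - 1, m - 1] = w * inv[m - 1]
    b[m - 1, m] = -w * (inv[m - 1] + inv[m])
    b[m - 1, m + 1] = w * inv[m]
    return b.T @ b

def bayes_smooth(x, d, alpha):
    """Posterior means (A31) and variances (A33) at a fixed regularizer alpha."""
    n = d.size
    g = alpha * curvature_matrix(x) + np.eye(n)      # matrix g of (A12)
    lam, vec = np.linalg.eigh(g)                     # single eigen-decomposition
    proj = vec.T @ d                                 # sum_k d_k e_ik = sqrt(lam_i) * h_i
    mean = vec @ (proj / lam)                        # (A31)
    resid = np.sum(proj**2 * (1.0 - 1.0 / lam))      # d^2 - h^2
    var = resid / (n - 2) * (vec**2 @ (1.0 / lam))   # (A33)
    return mean, var

# Synthetic noisy data, assumed for illustration only
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 30)
d = np.sin(2.0 * np.pi * x) + 0.05 * rng.standard_normal(x.size)
mean, var = bayes_smooth(x, d, alpha=1.0e-5)
```

Since $\sum_i e_{ij}h_i/\sqrt{\lambda'_i}$ is the $j$-th component of $g^{-1}\mathbf{d}$, the means could equally be obtained from a linear solve; the eigen-decomposition is kept here because the variances need it anyway.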



4. Addenda to the bayesian smoothing 

(i) Let the curvature of the function $t(x)$ be approximately known in advance. To specify this information, introduce
the norm of the difference between $d^2t/dx^2$ and the approximately known second derivative $d^2f/dx^2$:

$$ \tilde{\Omega}(t(x)) = \int\Big(\frac{d^2t}{dx^2} - \frac{d^2f}{dx^2}\Big)^2 dx \approx \omega. \tag{A34} $$

Notice that there is no need to know $f(x)$ itself; its second derivative is sufficient. The explicit presence of $f(x)$ in
the following formulas should be taken as a consequence of the technical trick applied: first $f(x)$ is subtracted from
the data, then it is added to the solution found.

Everywhere in formulas (A6)-(A33) make the substitutions:

$$ \tilde{t}_i = t_i - f_i, \qquad \tilde{d}_i = d_i - f_i, \qquad i = 1,\ldots,N. \tag{A35} $$

Performing the smoothing procedure described above, one finds $\tilde{\mathbf{t}}$, from which the
sought vector is given by the inverse transformation $\mathbf{t} = \tilde{\mathbf{t}} + \mathbf{f}$.
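
The subtract-then-add trick of (A35) can be illustrated as follows. This is a sketch with hypothetical names (`curvature_matrix`, `smooth_about_reference`) and synthetic data; for brevity the posterior mean is obtained from a linear solve with the matrix $g$ of (A12), which gives the same vector as the eigenvector expansion (A31).

```python
import numpy as np

def curvature_matrix(x):
    """Matrix Omega of the finite-difference quadratic form (A6)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    inv = 1.0 / np.diff(x)
    b = np.zeros((n - 2, n))
    m = np.arange(1, n - 1)
    w = np.sqrt(inv[m])
    b[m - 1, m - 1] = w * inv[m - 1]
    b[m - 1, m] = -w * (inv[m - 1] + inv[m])
    b[m - 1, m + 1] = w * inv[m]
    return b.T @ b

def smooth_about_reference(x, d, f, alpha):
    """Smooth d while penalizing only the deviation of its curvature from that of f."""
    g = alpha * curvature_matrix(x) + np.eye(len(x))
    t_shifted = np.linalg.solve(g, d - f)    # mean of the substituted problem (A35)
    return t_shifted + f                     # inverse transformation t = t~ + f

# Illustrative data: the reference f carries the curvature, d differs from it by a line
x = np.linspace(0.0, 1.0, 40)
f = x**3
d = f + 2.0 * x + 1.0
t_ref = smooth_about_reference(x, d, f, alpha=1.0)
t_plain = smooth_about_reference(x, d, np.zeros_like(x), alpha=1.0)
```

In this example `t_ref` reproduces `d`, because `d - f` is a straight line and carries no curvature penalty, whereas `t_plain`, smoothed without the reference, is pulled away from `d` toward a straight line.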

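(ii) In some tasks the value on the starting (zero) node is known without measurement. This sort of prior information
is a "hard" one, that is, it restricts the class of possible solutions; in the given case the solution must pass
through the known zero node. The quadratic form $\Omega(\mathbf{t})$ (or $\tilde{\Omega}(\mathbf{t})$ in the case of an approximately known second derivative)
in the expression for the prior probability changes to:

$$ \Omega(\mathbf{t}) = \sum_{i=1}^{N-1}\big[t_{i-1}\Delta_{i-1}^{-1} - t_i(\Delta_{i-1}^{-1}+\Delta_i^{-1}) + t_{i+1}\Delta_i^{-1}\big]^2\,\Delta_i^{-1} = \sum_{k,l=1}^{N}\Omega_{kl}t_k t_l + \Omega_{00}t_0^2 + 2\Omega_{01}t_0 t_1 + 2\Omega_{02}t_0 t_2, \tag{A36} $$

the first few matrix elements of $\Omega_{kl}$ now are:

$$
\begin{aligned}
&\Omega_{00} = \Delta_0^{-2}\Delta_1^{-1}, \qquad \Omega_{01} = -(\Delta_0\Delta_1)^{-1}(\Delta_0^{-1}+\Delta_1^{-1}), \qquad \Omega_{02} = \Delta_0^{-1}\Delta_1^{-2}, \\
&\Omega_{11} = \Delta_1^{-1}(\Delta_0^{-1}+\Delta_1^{-1})^2 + \Delta_1^{-2}\Delta_2^{-1}, \qquad \Omega_{12} = -\Delta_1^{-2}(\Delta_0^{-1}+\Delta_1^{-1}) - (\Delta_1\Delta_2)^{-1}(\Delta_1^{-1}+\Delta_2^{-1}).
\end{aligned} \tag{A37}
$$

If $t_0 = 0$ (or $\tilde{t}_0 = 0$), no further changes to the smoothing formulas (A6)-(A33) are needed; at $t_0 \ne 0$ the changes
are evident: instead of the scalar product $\mathbf{d}\mathbf{t}$ in (A11) there will be $(\mathbf{d}-\hat{\mathbf{d}})\mathbf{t}$, where $\hat{d}_1 = \alpha t_0\Omega_{01}$, $\hat{d}_2 = \alpha t_0\Omega_{02}$, and all remaining
$\hat{d}_i = 0$; to $\mathbf{d}^2$ the term $\alpha t_0^2\Omega_{00}$ will be added.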

(iii) Making some changes in the smoothing problem considered above allows one to solve the problem of
deconvolution. If the experimental value $d_j$ on some node $j$ is determined not only by $t_j$ but also by the values on
some neighboring nodes, then instead of (A1) we have:

$$ d_i = \sum_{j=1}^{N}r_{ij}t_j + n_i, \qquad i = 1,\ldots,N, \tag{A38} $$

where $r_{ij}$ is the grid representation of the impulse response function. Instead of expression (A4), for the likelihood
we now have:

$$ P(\mathbf{d}|\mathbf{t},\sigma) = (2\pi\sigma^2)^{-N/2}\exp\Big(-\frac{1}{2\sigma^2}\sum_{k=1}^{N}\Big[d_k - \sum_{i=1}^{N}r_{ki}t_i\Big]^2\Big), \tag{A39} $$

and instead of (A11), the posterior probability for $t_j$ is now expressed as:

$$ P(t_j|\mathbf{d},\sigma,\alpha) \propto \int\!\cdots dt_{i\ne j}\cdots\;\sigma^{-2N}\alpha^{N/2}\exp\Big(-\frac{1}{2\sigma^2}\Big[\mathbf{d}^2 - 2\sum_{k=1}^{N}D_k t_k + \sum_{k,l=1}^{N}G_{kl}t_k t_l\Big]\Big), \tag{A40} $$

where

$$ G_{kl} = \alpha\,\Omega_{kl} + \sum_{i=1}^{N}r_{ik}r_{il}, \qquad D_k = \sum_{i=1}^{N}r_{ik}d_i. \tag{A41} $$

Further steps for finding $\bar{\mathbf{t}}$ are analogous to those described above.
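
As an illustration of (A41), the sketch below forms $G$ and $D$ in matrix notation, $G = \alpha\Omega + R^{T}R$ and $D = R^{T}\mathbf{d}$, and takes the posterior mean as the solution of $G\,\bar{\mathbf{t}} = D$, by analogy with the smoothing case where the mean is $g^{-1}\mathbf{d}$. The helper names, the Gaussian response of assumed width 0.03 and the noiseless test signal are hypothetical choices made only for this example.

```python
import numpy as np

def curvature_matrix(x):
    """Matrix Omega of the finite-difference quadratic form (A6)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    inv = 1.0 / np.diff(x)
    b = np.zeros((n - 2, n))
    m = np.arange(1, n - 1)
    w = np.sqrt(inv[m])
    b[m - 1, m - 1] = w * inv[m - 1]
    b[m - 1, m] = -w * (inv[m - 1] + inv[m])
    b[m - 1, m + 1] = w * inv[m]
    return b.T @ b

def deconvolve(x, d, r, alpha):
    """Posterior mean of t for d = r @ t + noise, using G and D of (A41)."""
    big_g = alpha * curvature_matrix(x) + r.T @ r   # G_kl = alpha*Omega_kl + sum_i r_ik r_il
    big_d = r.T @ d                                 # D_k = sum_i r_ik d_i
    return np.linalg.solve(big_g, big_d)

# Illustrative response matrix: row-normalized Gaussian, width assumed
x = np.linspace(0.0, 1.0, 50)
width = 0.03
r = np.exp(-0.5 * ((x[:, None] - x[None, :]) / width) ** 2)
r /= r.sum(axis=1, keepdims=True)

t_true = 2.0 * x + 1.0          # a straight line has zero curvature penalty
d = r @ t_true                  # noiseless blurred data
t_hat = deconvolve(x, d, r, alpha=1.0e-6)
```

With the identity matrix in place of `r` the function reduces to the plain smoothing solution, which is a convenient consistency check.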